Search | Global Index Medicus

Classification of multi-class homo-oligomer based on a novel method of feature extraction from protein primary structure / 生物医学工程学杂志

Shaowu ZHANG; Quan PAN; Chunhui ZHAO; Yongmei CHENG.

Journal of Biomedical Engineering ; (6): 721-726, 2007.

Article in Chinese | WPRIM | ID: wpr-346084

ABSTRACT

A novel method of feature extraction from protein primary structure is proposed and applied to classify protein homodimers, homotrimers, homotetramers and homohexamers. In this method, a protein sequence is represented by a feature vector composed of amino acid compositions and a set of weighted auto-correlation function factors of an amino acid residue index. High classification accuracies are obtained. For example, with the same support vector machine (SVM), the total accuracies of the QIANA, AIANB, MEEJ, ROBB and SNEP sets based on this feature extraction method are 77.63%, 77.16%, 76.46%, 76.70% and 75.06%, respectively, in the jackknife test. These are 6.39, 5.92, 5.22, 5.46 and 3.82 percentage points higher, respectively, than the accuracy of the COMP set, which is based on the conventional method composed of amino acid compositions. With the same QIANA set, the total accuracy of the SVM is 77.63%, which is 16.29 percentage points higher than that of the covariant discriminant algorithm. These results show that: (1) the novel feature extraction method is effective and feasible, and the feature vectors based on this method may contain more protein quaternary structure information and appear to capture essential information about the composition and hydrophobicity of residues in the surface patches buried in the interfaces of associated subunits; (2) SVM can be regarded as a powerful computational tool for classifying the homo-oligomers of proteins.
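The feature vector described in the abstract can be illustrated with a minimal sketch. The abstract does not give the formula, the residue index, the number of lags, or the weight, so everything specific here is an assumption: the auto-correlation factor is taken as the mean product of index values at residues `k` positions apart, the Kyte-Doolittle hydrophobicity scale stands in for the residue index, and `max_lag` and `weight` are hypothetical parameters.

```python
# Sketch only: the exact definition used in the paper is not stated in the abstract.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydrophobicity scale, used here as an ASSUMED residue index.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def composition(seq):
    """Fraction of each of the 20 standard amino acids in the sequence."""
    n = len(seq)
    return [seq.count(a) / n for a in AMINO_ACIDS]


def autocorrelation(seq, index, max_lag):
    """Assumed form: r_k = mean of h[i] * h[i + k] for lags k = 1..max_lag."""
    h = [index[a] for a in seq]
    n = len(h)
    return [
        sum(h[i] * h[i + k] for i in range(n - k)) / (n - k)
        for k in range(1, max_lag + 1)
    ]


def feature_vector(seq, index=KYTE_DOOLITTLE, max_lag=5, weight=0.1):
    """Concatenate the composition with the weighted auto-correlation factors."""
    if len(seq) <= max_lag:
        raise ValueError("sequence must be longer than max_lag")
    factors = autocorrelation(seq, index, max_lag)
    return composition(seq) + [weight * r for r in factors]
```

Under these assumptions, `feature_vector(seq, max_lag=m)` returns a vector of length 20 + m that could be passed to any classifier, for example an SVM evaluated by leave-one-out (jackknife) testing.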

Subject(s)

Humans, Algorithms, Amino Acid Sequence, Artificial Intelligence, Cluster Analysis, Models, Molecular, Molecular Sequence Data, Protein Conformation, Proteins, Chemistry, Classification, Sequence Analysis, Protein, Methods

Protein Fold Recognition With Support Vector Machines Fusion Network / 生物化学与生物物理进展

Jianyu SHI; Quan PAN; Shaowu ZHANG; Yan LIANG.

Progress in Biochemistry and Biophysics ; (12)2006.

Article in Chinese | WPRIM | ID: wpr-586053

ABSTRACT

One of the important approaches to structure analysis is protein fold recognition, which is often applied when there is no significant sequence similarity between structurally similar proteins. A framework with a three-layer support vector machines fusion network (SFN) is presented. The framework is applied to 27-class protein fold recognition from the primary structure of proteins. SFN uses support vector machines as member classifiers and adopts All-Versus-All as the multi-class categorization scheme. Six groups of features are divided into major and minor ones by SFN, and several diversity fusion schemes are built correspondingly. The final decision is made by dynamic selection among the results of all fusion schemes. When it is unclear which fusion of feature groups can achieve good prediction, SFN offers a dependable solution by automatically selecting the optimal fusion of feature groups, which can ensure the best recognition. The overall recognition system achieves 61.04% fold prediction accuracy on the independent test dataset. The results and the comparison with other approaches demonstrate the effectiveness of SFN and thus encourage its further exploration.
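The All-Versus-All scheme and the idea of fusing decisions from several feature groups can be sketched as follows. This is not the SFN itself: the abstract does not specify the fusion rules or the dynamic selection step, so summing votes across feature groups is an assumption, and a nearest-centroid rule is used as a simple stand-in for the SVM member classifiers. All function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def centroid(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(u, v):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def train_all_versus_all(samples, labels):
    """Build one binary decider per unordered pair of classes.

    Stand-in learner: nearest centroid (NOT an SVM, which the paper uses).
    """
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    cents = {c: centroid(rows) for c, rows in by_class.items()}
    deciders = {}
    for a, b in combinations(sorted(cents), 2):
        ca, cb = cents[a], cents[b]
        # Default arguments bind the current pair for each lambda.
        deciders[(a, b)] = (
            lambda x, a=a, b=b, ca=ca, cb=cb:
            a if sq_dist(x, ca) <= sq_dist(x, cb) else b
        )
    return deciders


def vote(deciders, x):
    """Count how many pairwise deciders pick each class for input x."""
    return Counter(decide(x) for decide in deciders.values())


def fuse_and_predict(group_deciders, group_inputs):
    """Assumed fusion rule: sum votes over feature groups, pick the top class."""
    total = Counter()
    for deciders, x in zip(group_deciders, group_inputs):
        total.update(vote(deciders, x))
    # Ties are broken by sorted class order so the result is deterministic.
    return max(sorted(total), key=lambda c: total[c])
```

With 27 classes, All-Versus-All trains 27 × 26 / 2 = 351 pairwise classifiers per feature group; in this sketch each feature group would get its own `deciders` dictionary, and `fuse_and_predict` combines their votes.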
